Multi-omic analysis identifies metabolic biomarkers for the early detection of breast cancer and therapeutic response prediction

Summary
Reliable blood-based tests for identifying early-stage breast cancer remain elusive. Using single-cell transcriptomic sequencing analysis, we show a close correlation between nucleotide metabolism in breast cancer and the activation of regulatory T cells (Tregs) in the tumor microenvironment. This correlation differs between patients with triple-negative breast cancer (TNBC) and those with non-TNBC subtypes, and is likely to affect cancer prognosis through the A2AR-Treg pathway. Combining machine learning with absolute quantitative metabolomics, we have established an effective approach to the early detection of breast cancer that uses a four-metabolite panel including inosine and uridine. This metabolomics study, involving 1111 participants, demonstrates high accuracy across the training, test, and independent validation cohorts. Inosine and uridine also prove predictive of the response to neoadjuvant chemotherapy (NAC) in patients with TNBC. This study deepens our understanding of nucleotide metabolism in breast cancer development and introduces a promising non-invasive method for detecting breast cancer early and predicting NAC response in patients with TNBC.


Figure S2. Expression profiles of metabolic pathways in scRNA-seq analyses and enrichment scores in the TCGA dataset, related to Figure 2 and Tables S1-3. (A) Identification of cell clusters using canonical markers for different cell types, as shown in the UMAP plot. (B) Heatmap displaying normalized large-scale CNVs in indicated cell types from BC patients. Reference cells are normal epithelial cells from 4 NC samples (upper panel). Large-scale CNVs are observed in epithelial cells in BC samples (lower panel). Red indicates high CNV levels, while blue indicates low CNV levels. (C) Heatmap showing the enrichment scores of the metabolic pathways in the TCGA dataset. Pathways are labeled by name and sorted by t-values. The top 10 enriched KEGG terms are presented. (D-E) Metabolic pathway expression profiles in normal breast tissues (left panel) and BC (right panel). For each pathway, the fold change in epithelial cells (normal tissues, n = 4) or tumor cells (BC, n = 5) was calculated relative to other cell types and corrected for sample of origin. Pathways were ordered by log-fold change in normal epithelial and tumor cells, respectively.
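The fold-change calculation described for panels D-E can be illustrated with a minimal sketch. It assumes each cell already has a positive pathway activity score, and it omits the correction for sample of origin, whose exact form is not given in the caption. The function name, cell-type labels, pathway names, and scores are all hypothetical.

```python
import math
from statistics import mean


def pathway_log2_fold_change(scores, cell_types, target):
    """Log2 ratio of the mean pathway score in `target` cells to the mean in all other cells.

    Assumes positive scores; no correction for sample of origin is applied here.
    """
    target_scores = [s for s, c in zip(scores, cell_types) if c == target]
    other_scores = [s for s, c in zip(scores, cell_types) if c != target]
    if not target_scores or not other_scores:
        raise ValueError("need at least one target cell and one other cell")
    return math.log2(mean(target_scores) / mean(other_scores))


# Hypothetical toy data: four cells, two pathways.
cell_types = ["tumor", "tumor", "T cell", "fibroblast"]
pathway_scores = {
    "pathway_a": [2.0, 2.0, 1.0, 1.0],
    "pathway_b": [1.0, 1.0, 1.0, 1.0],
}

log_fc = {
    name: pathway_log2_fold_change(scores, cell_types, "tumor")
    for name, scores in pathway_scores.items()
}

# Order pathways by log-fold change, highest first, as in the figure panels.
ordered_pathways = sorted(log_fc, key=log_fc.get, reverse=True)
print(ordered_pathways, log_fc)
```

A positive value indicates that the pathway is more active in the target cell type than in the remaining cell types.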

Figure S3. Validation of scRNA-seq metabolism findings in six BC samples from an external scRNA-seq dataset, related to Figure 2 and Table S1. (A) Identification of cell populations in human breast tissues. The UMAP visualization depicts 15,116 cells from 6 breast tumor samples and 19,496 cells from 4 adjacent nonmalignant breast tissues, revealing eight major cell clusters labeled by cell type. Each dot represents an individual cell and is color coded accordingly. (B) Canonical markers for the different cell types were used to identify cell clusters, as illustrated in the UMAP plot. (C) Average proportion of different cell types in normal breast samples (n = 4) and BC tissues (n = 6). (D) Heatmap of normalized large-scale copy number variations (CNVs) in specific cell types from BC patients. Reference cells are normal epithelial cells from 4 NC samples (upper panel), whereas large-scale CNVs are observed in epithelial cells in BC samples (lower panel). High CNV levels are indicated in red, while low levels are indicated in blue. (E) AUCell analysis was performed on up-regulated and down-regulated pathways. Metabolism-related gene sets are labeled by name and sorted by t-values. The top 10 upregulated and downregulated metabolic pathways are displayed. (F-G) Metabolic pathway expression profiles in normal breast tissues (left panel) and BC (right panel). For each pathway, the fold change in epithelial cells (normal tissues, n = 4) or tumor cells (BC, n = 6) was calculated relative to other cell types and corrected for sample of origin. Pathways were ordered by log-fold change in normal epithelial and tumor cells, respectively. (H) Expression changes in metabolic pathways in tumor cells. Pathways are ordered by log-fold change in normal epithelial cells (upper panel) and tumor cells (lower panel), respectively. Solid lines connect the same pathway in normal and BC tissues, with red lines highlighting nucleotide metabolism. Each node represents a single pathway, and node color reflects its expression level in normal epithelial cells or tumor cells compared to other cell types.

Figure S4. Details of clustering results of nineteen BC samples from an external scRNA-seq dataset, related to Figure 2 and Table S2. (A) UMAP visualization of 64,738 single cells from nineteen primary breast tumors, including 7 TNBC, 8 HR+/HER2-, and 4 HER2+ tumors, with eight major cell clusters identified and labeled. (B) Canonical markers for different cell types were used to identify cell clusters, as shown in the UMAP plot. (C) UMAP visualization of various T-cell types, comprising 23,958 single cells, with five major T-cell clusters identified and labeled. (D) Canonical markers for various T-cell types were used to identify cell clusters, as shown in the UMAP plot.

Figure S5. Expression of purine metabolism (A) and pyrimidine metabolism (B) in each cell type among the three subtypes of nineteen BC patients from an external scRNA-seq dataset (t-test), related to Figure 2 and Table S2.

Figure S6. Performance evaluation of the PLS-DA models in untargeted metabolomic analysis, related to Figure 3. (A and B) Cross-validation results. The selected performance measure, Q2, shows that the four-component model is best for untargeted metabolomics data in ESI+ mode (A) and in ESI- mode (B) (indicated by a red star). (C and D) Permutation test results. The histogram illustrates that the observed statistic, derived from the original data, falls outside the null distribution generated by permutations of the data in ESI+ mode (C) and in ESI- mode (D). The p value is significant (<0.001).
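The logic of the permutation test in panels C and D can be illustrated with a minimal sketch. It is not the authors' PLS-DA procedure: as a simplifying assumption, a difference in group means stands in for the model's performance statistic. The function name and the toy intensity values are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Empirical one-sided p value for mean(group_a) - mean(group_b).

    Group labels are shuffled to build the null distribution of the statistic.
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign samples to groups at random
        stat = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if stat >= observed:
            exceed += 1
    # Add-one correction so the estimated p value is never exactly zero.
    return observed, (exceed + 1) / (n_permutations + 1)


# Hypothetical toy intensities for one feature in two groups.
cases = [5.1, 4.8, 5.6, 5.3, 4.9, 5.4, 5.0, 5.5]
controls = [3.9, 4.1, 3.7, 4.0, 4.2, 3.8, 4.3, 3.6]

observed, p_value = permutation_p_value(cases, controls)
print(f"observed difference = {observed:.3f}, p = {p_value:.4f}")
```

With the add-one correction, the smallest attainable p value is 1 / (n_permutations + 1), so reporting p < 0.001 requires at least 1000 permutations.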

Figure S7. Validation of metabolite marker identities using chemical standards in DDA mode (untargeted metabolomics), related to Figure 4. (A-E) Comparison of metabolites in plasma samples (upper panel) with chemical standards (lower panel). MS2 spectra and retention times of each metabolite are indicated on the respective panel for inosine in ESI+ mode (A) and in ESI- mode (B), as well as for uridine (C), phenylalanine (D), and threonine (E) in ESI- mode.

Figure S8. Validation of metabolite marker identities using chemical standards in MRM mode (targeted metabolomics), related to Figure 4 and Tables S7-8. (A and C) Extracted ion chromatograms (XICs) for four selected metabolites in MRM results of chemical standards (A) and plasma samples (C). (B) XICs of SIL-IS for four selected metabolites in MRM results.

Figure S9. Comparative XICs of four selected metabolites in MRM results before and after method optimization, related to Figure 4. (A) XICs of four selected metabolites under initial conditions. (B) XICs of four selected metabolites under optimized conditions.